AIbase

# Chinese Image Understanding

Qwen2.5 VL 7B Captioner Relaxed GGUF
Apache-2.0
Qwen2.5-VL-7B-Captioner-Relaxed is a multimodal vision-language model based on the Qwen2.5 architecture, focusing on image-to-text generation tasks.
Image-to-Text, English
samgreen
320
1
MMAlaya2
Apache-2.0
A multimodal model fine-tuned from InternVL-Chat-V1-5 that performs strongly on the MMBench benchmark
Image-to-Text
DataCanvas
26
2
Moondream1
A 1.6B-parameter multimodal model combining SigLIP and Phi-1.5 architectures, supporting image understanding and Q&A tasks
Image-to-Text, Transformers, English
vikhyatk
70.48k
487
© 2025 AIbase